Spatial Analysis

This is an R markdown document showing very basic spatial analysis using Seurat v5. The data is from a healthy mouse brain downloaded from 10X datasets. The data was processed using Space Ranger on the palmetto cluster.

This analysis was performed on the palmetto cluster using the biocm-seurat-v5 singularity image.

suppressMessages(suppressWarnings({library(Seurat)
  library(SeuratData)
  library(ggplot2)
  library(patchwork)
  library(dplyr)
  library(Matrix)
  library(CARD)
}))

Loading the data directly from the Space Ranger outputs.

seurat <- Load10X_Spatial(
    data.dir = '/mnt/bg/spatial/sr_out/Visium_FFPE_Mouse_Brain/outs',
    filename = "filtered_feature_bc_matrix.h5",
    assay = 'Spatial')

seurat
## An object of class Seurat 
## 19465 features across 2237 samples within 1 assay 
## Active assay: Spatial (19465 features, 0 variable features)
##  2 layers present: counts, data
##  1 image present: slice1

Once the data is in the seurat object, a few simple QC plots can be generated.

p1 <- VlnPlot(seurat, features = "nCount_Spatial", pt.size = 0.1) + NoLegend()
p2 <- SpatialFeaturePlot(seurat, features = "nCount_Spatial") + theme(legend.position = "right")
wrap_plots(p1, p2)

Perform SCT on data

seurat <- SCTransform(seurat, assay = "Spatial", verbose = F)

Examining the spatial expression of various genes using SpatialFeaturePlot

p1 <- SpatialFeaturePlot(seurat, features = "Il1rapl1")
p2 <- SpatialFeaturePlot(seurat, features = "Adam33")
p3 <- SpatialFeaturePlot(seurat, features = "Vgf")
wrap_plots(p1, p2, p3)

Now running the typical dimensionality reduction and clustering algorithms

seurat <- RunPCA(seurat, assay = "SCT")
## PC_ 1 
## Positive:  Trf, Mobp, Mbp, Plp1, Ptgds, Enpp2, Car2, Ttr, Apod, Cnp 
##     Plaat3, Mal, Slc4a2, Plekhb1, Cryab, Zic1, Dbi, Mag, Igf2, Pcp4l1 
##     Fxyd1, Cldn11, Tmem98, Clic6, Cd82, Elovl7, Ecrg4, Pltp, Car14, Col9a3 
## Negative:  Nrgn, Slc17a7, Camk2n1, Enc1, Olfm1, Hpcal4, Nptxr, Ddn, Chn1, Camk2a 
##     Itpka, Vxn, Arc, Mef2c, Psd, Hpca, Ctxn1, Slc30a3, Snap25, Arpp21 
##     Cck, Baiap2, Icam5, Fam131a, Snca, Lamp5, Ngef, Ppp3ca, Rtn4rl2, Cnksr2 
## PC_ 2 
## Positive:  Prkcd, Plekhg1, Rora, Pcp4, Adarb1, Rgs16, Ramp3, Zic1, Tnnt1, Ptpn3 
##     Ntng1, Synpo2, Ptpn4, Slc17a6, Tcf7l2, Amotl1, Kcnc2, Plcb4, Rab37, Abhd12b 
##     Rims3, Cplx1, Ccdc136, Rasd1, Grm1, Patj, Slc6a11, Nrip3, Pdp1, Zdhhc22 
## Negative:  Ttr, Ppp1r1b, Ddn, Ezr, Hpcal4, Car12, Slc29a4, Nrgn, Enpp2, Clic6 
##     Igf2, Spint2, Ecrg4, Rbp1, Sulf1, Fxyd1, Folr1, Lbp, Kl, Drc7 
##     Ace, Gm19935, Prelp, Abhd2, Rsph1, Tbc1d9, Calml4, Nptxr, Mfrp, Ephx1 
## PC_ 3 
## Positive:  Ptpn3, Rora, Rgs4, Prkcd, Pcp4, Zic1, Pcp4l1, Adarb1, Rgs16, Cck 
##     Epn3, Ramp3, Synpo2, Ptpn4, Sh3d19, Tnnt1, Eps8l2, Ntng1, Plekhg1, Abhd12b 
##     Amotl1, Plcb4, Rasd1, Kcnc2, Pdp1, Gabrd, Gabra4, Rab37, Patj, Rims3 
## Negative:  Plp1, Mobp, Mbp, Cnp, Mal, Mag, Cldn11, Gfap, Fth1, Plekhb1 
##     Apod, Gatm, Tspan2, Mog, Fa2h, Sept4, Bcas1, Ermn, Qdpr, Ppp1r14a 
##     Cryab, Opalin, Ptgds, Ugt8a, Tppp3, Gpr37, Car2, Tmem88b, Gad2, Myrf 
## PC_ 4 
## Positive:  Nnat, Hap1, Cpne7, Hpcal1, Ly6h, Lypd1, Zcchc12, Doc2b, Pcdh8, Calb2 
##     Necab2, Cpne6, Gria1, Slit1, Nptxr, Baiap3, Hpcal4, Ahi1, Kctd4, Ak5 
##     Camk2d, Ecel1, Npy2r, Ddn, Calb1, Bhlhe22, Acvr2a, Marcksl1, Nrp2, Crym 
## Negative:  Arc, Camk2n1, Satb2, Lamp5, Nrgn, Vxn, Ier5, Cabp1, Pvalb, Satb1 
##     Snap25, Ngef, Egr1, 1110008P14Rik, Stac2, Atp2b2, Gm11549, Mef2c, Kcnh5, Prkcb 
##     Dkk3, Mbp, Atp1a1, Nr4a1, Epop, Cd34, Lmo4, Myl4, Kcnab3, Osbpl1a 
## PC_ 5 
## Positive:  Resp18, Zcchc12, Nap1l5, Baiap3, Calb2, Hap1, Gap43, Ndn, Tmem130, Vat1l 
##     6330403K07Rik, AW551984, Peg10, Dlk1, Ahi1, Impact, Timp2, Nrsn2, Sparc, Nnat 
##     Scg2, Gprasp2, Peg3, Gpx3, Tmem255a, Itih3, Gaa, Plcxd2, Slc6a11, Wdr6 
## Negative:  Ddn, Camk2a, Gfap, Cabp7, Wipf3, Hpca, Shisa6, Psd, Rasd1, 2010300C02Rik 
##     Nr3c2, Nsmf, Zbtb20, Fth1, Ptk2b, Slc1a2, Mt1, Ppp1r9b, Galnt17, Neurod2 
##     Adcy1, Pcdh20, Ncdn, Mt2, Cnih2, Prox1, Epha7, Cst3, Kcnab2, Tmsb4x
seurat <- FindNeighbors(seurat, reduction = "pca", dims = 1:30)
## Computing nearest neighbor graph
## Computing SNN
seurat <- FindClusters(seurat)
## Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck
## 
## Number of nodes: 2237
## Number of edges: 57415
## 
## Running Louvain algorithm...
## Maximum modularity in 10 random starts: 0.8947
## Number of communities: 17
## Elapsed time: 0 seconds
seurat <- RunUMAP(seurat, reduction = "pca", dims = 1:30)
## 15:08:40 UMAP embedding parameters a = 0.9922 b = 1.112
## Found more than one class "dist" in cache; using the first, from namespace 'spam'
## Also defined by 'BiocGenerics'
## 15:08:40 Read 2237 rows and found 30 numeric columns
## 15:08:40 Using Annoy for neighbor search, n_neighbors = 30
## Found more than one class "dist" in cache; using the first, from namespace 'spam'
## Also defined by 'BiocGenerics'
## 15:08:40 Building Annoy index with metric = cosine, n_trees = 50
## 0%   10   20   30   40   50   60   70   80   90   100%
## [----|----|----|----|----|----|----|----|----|----|
## **************************************************|
## 15:08:40 Writing NN index file to temp file /tmp/Rtmpx8QGFq/file25f55f4c52d9c8
## 15:08:40 Searching Annoy index using 1 thread, search_k = 3000
## 15:08:41 Annoy recall = 100%
## 15:08:42 Commencing smooth kNN distance calibration using 1 thread with target n_neighbors = 30
## 15:08:43 Initializing from normalized Laplacian + noise (using RSpectra)
## 15:08:43 Commencing optimization for 500 epochs, with 81298 positive edges
## 15:08:45 Optimization finished

We can now look at the DimPlot in UMAP 2D space next to the spatial image with correlated regions

p1 <- DimPlot(seurat, reduction = "umap", label = TRUE)
p2 <- SpatialDimPlot(seurat, label = TRUE, label.size = 3)
## Scale for fill is already present.
## Adding another scale for fill, which will replace the existing scale.
wrap_plots(p1, p2)

To examine specific clusters more clearly, we can use the ‘cells.highlight’ parameter in SpatialDimPlot:

SpatialDimPlot(seurat, cells.highlight = CellsByIdentities(object = seurat, idents = c(0, 1, 2, 3, 4, 5)), facet.highlight = TRUE, ncol = 3, cols.highlight=c('red','grey75'))

We can also find spatial markers

de_markers <- FindMarkers(seurat, ident.1 = 0, ident.2 = 1)
SpatialFeaturePlot(object = seurat, features = rownames(de_markers)[1:3], alpha = c(0.1, 1), ncol = 3)

And the top spatially variable features:

seurat <- FindSpatiallyVariableFeatures(seurat, assay = "SCT", features = VariableFeatures(seurat)[1:1000],
    selection.method = "moransi")
## Computing Moran's I
top.features <- seurat@assays[["SCT"]]@meta.features |> as.data.frame() |>
  arrange(moransi.spatially.variable.rank) |> head(6) |> rownames()

SpatialFeaturePlot(seurat, features = top.features, ncol = 3, alpha = c(0.1, 1))

Now to inspect the ‘Spatial’ Assay:

seurat
## An object of class Seurat 
## 38587 features across 2237 samples within 2 assays 
## Active assay: SCT (19122 features, 3000 variable features)
##  3 layers present: counts, data, scale.data
##  1 other assay present: Spatial
##  2 dimensional reductions calculated: pca, umap
##  1 image present: slice1

Examining the counts layer in the Spatial Assay:

seurat@assays[['Spatial']]@counts[1:5,1:5]
## 5 x 5 sparse Matrix of class "dgCMatrix"
##        AACACTTGGCAAGGAA-1 AACAGGATTCATAGTT-1 AACAGGTTATTGCACC-1
## Xkr4                    .                  .                  .
## Rp1                     .                  .                  .
## Sox17                   .                  .                  .
## Lypla1                  .                  .                  2
## Tcea1                   2                  2                  5
##        AACAGGTTCACCGAAG-1 AACATCTAATGACCGG-1
## Xkr4                    .                  2
## Rp1                     .                  .
## Sox17                   .                  .
## Lypla1                  .                  1
## Tcea1                   6                  4

The feature matrix here looks like a typical RNA count matrix, but the barcodes (columns) are not cells—rather, they are spots on the slide. The rows are still genes, so the matrix shows the amount of genes detected in each spot on the capture slide. Each spot may contain multiple cells. In theory, similar cells will exist next to each other, but it takes further analysis to predict the cell type(s) found in each spot. This is called deconvolution and is presented below.

The coordinates for each spot in the image are located in the image layer:

seurat@images[["slice1"]]@coordinates[1:10,]
##                    tissue row col imagerow imagecol
## AACACTTGGCAAGGAA-1      1  47  71     1635     1052
## AACAGGATTCATAGTT-1      1  49  43     1671      746
## AACAGGTTATTGCACC-1      1  28  86     1276     1218
## AACAGGTTCACCGAAG-1      1  51  41     1709      724
## AACATCTAATGACCGG-1      1  67  55     2013      875
## AACCAAGGTATCAGGC-1      1  38 104     1468     1413
## AACCACTGCCATAGCC-1      1  29  49     1292      815
## AACCAGAATCAGACGT-1      1  59  77     1864     1116
## AACCATGAATTGCCTT-1      1  73  75     2129     1092
## AACCGCCAGACTACTT-1      1  39  45     1481      770

DECONVOLUTION

There are several tools that use probabilistic methods to predict the proportion of each cell type within each spot. For an overview of many of these tools (and those that use other methods), see: https://www.nature.com/articles/s41467-023-37168-7

Many of these tools require a scRNA-seq reference dataset, although a few can perform deconvolution without one (STdeconvolve, for example).

This analysis will use CARD, and R-based method for cellular deconvolution in spatial transcriptomics: https://github.com/YingMa0107/CARD

CARD requires the following inputs: -Spatial transcriptomic count matrix -Spatial coordinates (in a df with columns “x” and “y” for each cell) -single cell RNA count matrix -single cell metadata, which includes cell-type annotation and sample divisions

spatial_counts <- seurat@assays[['Spatial']]@counts
spatial_counts <- as(as.matrix(spatial_counts), "sparseMatrix")
spatial_locations <- seurat@images[["slice1"]]@coordinates |> 
  select(col, row) |> 
  dplyr::rename(x = col) |>
  dplyr::rename(y = row) 
head(spatial_locations)
##                      x  y
## AACACTTGGCAAGGAA-1  71 47
## AACAGGATTCATAGTT-1  43 49
## AACAGGTTATTGCACC-1  86 28
## AACAGGTTCACCGAAG-1  41 51
## AACATCTAATGACCGG-1  55 67
## AACCAAGGTATCAGGC-1 104 38

Note: the coordinates from space ranger start at the top left, while CARD assumes the bottom left is the origin. Thus, to have the image remain in the same orientation as the image file and outputs from Seurat analysis, we need to convert the y coordinates to negative values.

spatial_locations$y <- spatial_locations$y * -1

And for the single cell reference data - data is taken from: https://www.nature.com/articles/s41586-018-0654-5 via the python package squidpy.

For CARD, we need the raw counts and metadata from this data.

#load sc data
mouse <- readRDS('/mnt/bg/spatial/mouse_cortex_ref.rds')
sc_counts <- mouse@assays[['RNA']]$counts
sc_counts <- as(as.matrix(sc_counts), "sparseMatrix")

sc_meta <- mouse@meta.data

Create CARD object

CARD_obj = createCARDObject(
    sc_count = sc_counts,
    sc_meta = sc_meta,
    spatial_count = spatial_counts,
    spatial_location = spatial_locations,
    ct.varname = "cell_subclass",
    ct.select = NULL,
    sample.varname = 'orig.ident',
    minCountGene = 100,
    minCountSpot = 5) 
## ## QC on scRNASeq dataset! ...
## No cell types selected, we will use all the cell types in the scRNA-seq data
## ## QC on spatially-resolved dataset! ...

Run deconvolution

CARD_obj = CARD_deconvolution(CARD_object = CARD_obj)
## ## create reference matrix from scRNASeq...
## Warning in asMethod(object): sparse->dense coercion: allocating vector of size
## 6.0 GiB
## ## Select Informative Genes! ...
## ## Deconvolution Starts! ...
## ## Deconvolution Finish! ...

Now we can visualize the results using CARD’s visiualize.pie function, which plots a small pie chart for each spot in the capture slide.

p1 <- CARD.visualize.pie(proportion = CARD_obj@Proportion_CARD,spatial_location = CARD_obj@spatial_location)
p1

This can be a little hard to read, so we can subset for specific cells. For each cell, we are plotting the proportion of each cell type in each capture slide spot.

ct.visualize = c("Astro", "L4", "Oligo")
p2 <- CARD.visualize.prop(
    proportion = CARD_obj@Proportion_CARD,        
    spatial_location = CARD_obj@spatial_location, 
    ct.visualize = ct.visualize,                
    colors = c("lightblue","lightyellow","red"), 
    NumCols = 3) 
CARD proportion plot.

CARD proportion plot.

The CARD tutorial shows further visualizations, including refining the resolution of these plots.

https://yingma0107.github.io/CARD/documentation/04_CARD_Example.html